Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change
Abstract
Understanding how words change their meanings over time is key to models of language and cultural evolution, but historical data on meaning is scarce, making theories hard to develop and test. Word embeddings show promise as a diachronic tool, but have not been carefully evaluated. We develop a robust methodology for quantifying semantic change by evaluating word embeddings (PPMI, SVD, word2vec) against known historical changes. We then use this methodology to reveal statistical laws of semantic evolution. Using six historical corpora spanning four languages and two centuries, we propose two quantitative laws of semantic change: (i) the law of conformity—the rate of semantic change scales with an inverse power-law of word frequency; (ii) the law of innovation—independent of frequency, words that are more polysemous have higher rates of semantic change.
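As a rough illustration of what "scales with an inverse power-law of word frequency" means, the sketch below fits a relation of the form rate ≈ c · frequency^β by ordinary least squares in log-log space, where a negative β corresponds to an inverse power law. The function name `fit_power_law`, the use of `numpy.polyfit`, and the synthetic numbers are all assumptions made for this example; they are not the paper's estimation procedure or its data.

```python
import numpy as np


def fit_power_law(frequency, change_rate):
    """Fit change_rate ~ c * frequency**beta via least squares in log-log space.

    Hypothetical helper for illustration only; returns (beta, c).
    """
    log_f = np.log(np.asarray(frequency, dtype=float))
    log_r = np.log(np.asarray(change_rate, dtype=float))
    # polyfit returns coefficients from highest degree down: [slope, intercept]
    beta, log_c = np.polyfit(log_f, log_r, deg=1)
    return beta, np.exp(log_c)


# Synthetic, noise-free values chosen for the example (NOT results from the paper).
frequency = np.array([1e-6, 1e-5, 1e-4, 1e-3])
change_rate = 0.02 * frequency ** -0.5

beta, c = fit_power_law(frequency, change_rate)
print(f"beta = {beta:.3f}, c = {c:.3f}")
```

A negative fitted exponent means that more frequent words change more slowly under this toy model. With real data the fit would be noisy and would need a measure of semantic change per word, which this sketch does not compute.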
Similar resources
Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings
This paper introduces the concept of temporal word analogies: pairs of words which occupy the same semantic space at different points in time. One well-known property of word embeddings is that they are able to effectively model traditional word analogies (“word w1 is to word w2 as word w3 is to word w4”) through vector addition. Here, I show that temporal word analogies (“word w1 at time tα is ...
Recent Developments in Spanish (and Romance) Historical Semantics
Diachronic semantics has long been the stepchild of Spanish (and Romance) historical linguistics. Although many studies have examined (often in searching detail) the semantic evolution of individual lexical items, Hispanists have ignored broader patterns of semantic change and the relevant theoretical and methodological issues posed by this phenomenon. Working within the framework of cognitive ...
A State-of-the-Art of Semantic Change Computation
This paper reviews the state of the art of an emerging field in computational linguistics, semantic change computation, and proposes a framework that summarizes the literature by identifying and expounding five essential components in the field: diachronic corpus, diachronic word sense characterization, change modelling, evaluation data and data visualization. Despite the potential of the field, the...
Verbs Change More than Nouns: a Bottom-up Computational Approach to Semantic Change
Linguists have identified a number of types of recurrent semantic change, and have proposed a number of explanations, usually based on specific lexical items. This paper takes a different approach, by using a distributional semantic model to identify and quantify semantic change across an entire lexicon in a completely bottom-up fashion, and by examining which distributional properties of words...
Towards Tracking Semantic Change by Visual Analytics
This paper presents a new approach to detecting and tracking changes in word meaning by visually modeling and representing diachronic development in word contexts. Previous studies have shown that computational models are capable of clustering and disambiguating senses; a more recent trend investigates whether changes in word meaning can be tracked by automatic methods. The aim of our study is ...
Journal: CoRR
Volume: abs/1605.09096
Publication date: 2016